Discriminative Density-ratio Estimation
Authors
Abstract
Covariate shift is a challenging problem in supervised learning that arises from a discrepancy between the training and test distributions. An effective approach that has recently drawn considerable attention in the research community is to reweight the training samples to minimize that discrepancy. Specifically, many methods are based on Density-ratio (DR) estimation techniques that apply to both regression and classification problems. Although these methods work well for regression problems, their performance on classification problems is not satisfactory. The key observation is that these methods focus on matching the sample marginal distributions without preserving the separation between classes in the reweighted space. In this paper, we propose a novel method for Discriminative Density-ratio (DDR) estimation that addresses this problem by estimating the density ratio of the joint distributions in a class-wise manner. The proposed algorithm is an iterative procedure that alternates between estimating the class information for the test data and estimating a new density ratio for each class. To incorporate the estimated class information of the test data, a soft matching technique is proposed. In addition, we employ a criterion that uses mutual information as an indicator to stop the iterative procedure, yielding a decision boundary that lies in a sparse region. Experiments on synthetic and benchmark datasets demonstrate the superiority of the proposed method in terms of both accuracy and robustness.
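To make the reweighting idea concrete, the following is a minimal sketch of ordinary marginal density-ratio estimation under covariate shift. It is not the DDR algorithm from the abstract. It assumes a swapped-in, generic technique: a logistic-regression classifier is trained to separate training inputs from test inputs, and its odds are converted into the ratio p_test(x) / p_train(x). The function names `fit_logistic` and `density_ratio`, the hyperparameters, and the synthetic Gaussian data are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, n_iter=2000, l2=1e-3):
    """Fit L2-regularised logistic regression by batch gradient descent."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))          # predicted P(test | x)
        grad = Xb.T @ (p - y) / len(y) + l2 * w
        w -= lr * grad
    return w

def density_ratio(X_train, X_test, **kw):
    """Estimate p_test(x) / p_train(x) at each training point (assumed baseline)."""
    X = np.vstack([X_train, X_test])
    # Label 0 = drawn from the training set, label 1 = drawn from the test set.
    y = np.concatenate([np.zeros(len(X_train)), np.ones(len(X_test))])
    w = fit_logistic(X, y, **kw)
    Xb = np.hstack([X_train, np.ones((len(X_train), 1))])
    logit = Xb @ w
    # Bayes' rule: p_test(x)/p_train(x) = (n_train/n_test) * P(test|x)/P(train|x),
    # and for a logistic model the odds P(test|x)/P(train|x) equal exp(logit).
    return (len(X_train) / len(X_test)) * np.exp(logit)

# Synthetic covariate shift: the test inputs are shifted to the right.
rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(500, 1))
X_test = rng.normal(1.0, 1.0, size=(500, 1))
weights = density_ratio(X_train, X_test)
```

The resulting `weights` could be passed as per-sample weights to a learner fitted on the training set. Because this baseline matches only the marginal input distributions and ignores class labels, it illustrates the kind of method whose classification performance the abstract describes as unsatisfactory.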
Similar resources
Density Ratio Hidden Markov Models
Hidden Markov models and their variants are the predominant sequential classification method in such domains as speech recognition, bioinformatics and natural language processing. Being generative rather than discriminative models, however, their classification performance is a drawback. In this paper we apply ideas from the field of density ratio estimation to bypass the difficult step of lear...
Discriminative Training of Subspace Gaussian Mixture Model for Pattern Classification
The Gaussian mixture model (GMM) has been widely used in pattern recognition problems for clustering and probability density estimation. For pattern classification, however, the GMM has to consider two issues: model structure in high-dimensional space and discriminative training for optimizing the decision boundary. In this paper, we propose a classification method using subspace GMM density mo...
Discriminative Mixture Models
We consider the problem of learning density mixture models for classification. Traditional learning of mixtures for density estimation focuses on models that correctly represent the density at all points in the sample space. Discriminative learning, on the other hand, aims at representing the density at the decision boundary. We introduce novel discriminative learning methods for mixtures of ge...
Vocabulary independent discriminative term frequency estimation
We introduce a discriminative approach to vocabulary independent term frequency estimation. Using two separate corpora and recognition systems, we show that our model can perform significantly better than a previously established generative model at this task.
Kernel Expansions with Unlabeled Examples
Modern classification applications necessitate supplementing the few available labeled examples with unlabeled examples to improve classification performance. We present a new tractable algorithm for exploiting unlabeled examples in discriminative classification. This is achieved essentially by expanding the input vectors into longer feature vectors via both labeled and unlabeled examples. The ...